W-TSV: Weighted topological signature vector for lexicon reduction in handwritten Arabic documents
Abstract
This paper proposes a holistic lexicon-reduction method for ancient and modern handwritten Arabic documents. The word shape is represented by the weighted topological signature vector (W-TSV), which encodes graph data into a low-dimensional vector space. Three directed acyclic graph (DAG) representations are proposed for Arabic word shapes, based on topological and geometrical features. Lexicon reduction is achieved by a nearest-neighbor search in the W-TSV space. The proposed framework has been tested on the IFN/ENIT and Ibn Sina databases, achieving degrees of reduction of 83.5% and 92.9%, respectively, at an accuracy of reduction of 90%.
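The nearest-neighbor reduction step and the two reported metrics can be illustrated with a minimal sketch. It does not compute the W-TSV itself: the two-dimensional vectors, the word labels, and the helper names `reduce_lexicon` and `evaluate` are all hypothetical placeholders. It also assumes the commonly used definitions, where accuracy of reduction is the fraction of queries whose true word survives in the reduced lexicon, and degree of reduction is one minus the average fraction of the lexicon that is kept.

```python
import math


def reduce_lexicon(query, lexicon_vectors, k):
    """Return the k lexicon words whose vectors lie closest to the query.

    lexicon_vectors maps each word to a fixed-length feature vector
    (a stand-in here for a W-TSV; the real descriptor is not computed).
    """
    ranked = sorted(
        lexicon_vectors,
        key=lambda word: math.dist(query, lexicon_vectors[word]),
    )
    return ranked[:k]


def evaluate(queries, lexicon_vectors, k):
    """Compute (accuracy of reduction, degree of reduction).

    queries is a list of (vector, true_word) pairs. The metric
    definitions are assumptions stated in the text above this block.
    """
    hits = 0
    kept = 0
    for vector, true_word in queries:
        reduced = reduce_lexicon(vector, lexicon_vectors, k)
        hits += true_word in reduced
        kept += len(reduced)
    accuracy = hits / len(queries)
    degree = 1 - kept / (len(queries) * len(lexicon_vectors))
    return accuracy, degree


# Toy data: hypothetical 2-D vectors, not real word-shape descriptors.
lexicon_vectors = {
    "word_a": (0.0, 0.0),
    "word_b": (1.0, 0.0),
    "word_c": (0.0, 1.0),
    "word_d": (5.0, 5.0),
}
queries = [
    ((0.1, 0.0), "word_a"),
    ((4.0, 4.5), "word_d"),
    ((0.9, 0.2), "word_c"),
]

accuracy, degree = evaluate(queries, lexicon_vectors, k=2)
print(f"accuracy of reduction: {accuracy:.3f}")
print(f"degree of reduction:   {degree:.3f}")
```

Raising `k` keeps more candidates per query, which tends to raise the accuracy of reduction while lowering the degree of reduction; the figures in the abstract report the degree reached at a fixed 90% accuracy.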
Similar resources
Holistic Approach for Classifying and Retrieving Personal Arabic Handwritten Documents
This paper presents a novel holistic technique for classifying and retrieving Arabic handwritten text documents. The retrieval of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several features are extracted from these connected parts and then co...
Arabic word descriptor for handwritten word indexing and lexicon reduction
Word recognition systems use a lexicon to guide the recognition process in order to improve the recognition rate. However, as the lexicon grows, the computation time increases. In this paper, we present the Arabic word descriptor (AWD) for Arabic word shape indexing and lexicon reduction in handwritten documents. It is formed in two stages. First, the structural descriptor (SD) is computed for ...
A two-stage method for handwritten Farsi word recognition using adaptive blocking of the image gradient
This paper presents a two-step method for offline handwritten Farsi word recognition. In the first step, to improve recognition accuracy and speed, an algorithm is proposed for initially eliminating lexicon entries that are unlikely to match the input image. For lexicon reduction, the words of the lexicon are clustered using the ISOCLUS and hierarchical clustering algorithms. Clustering is based on the featur...
Classification of Personal Arabic Handwritten Documents
This paper presents a novel holistic technique for classifying Arabic handwritten text documents. The classification of Arabic handwritten documents is performed in several steps. First, the Arabic handwritten document images are segmented into words, and then each word is segmented into its connected parts. Second, several structural and statistical features are extracted from these connected ...
Use of the Shearlet Transform and Transfer Learning in Offline Handwritten Signature Verification and Recognition
Despite the rapid growth of technology, the handwritten signature remains users' first choice among biometrics. In this paper, a new methodology for offline handwritten signature verification and recognition based on the Shearlet transform and transfer learning is proposed. Since a large percentage of handwritten signatures are composed of curves and the performance of a sig...
Journal:
- Pattern Recognition
Volume 45
Publication date: 2012